feat: taxonomy patch instead of re-generating #554

Open
wants to merge 12 commits into
base: main

alexgarel
Member

@alexgarel alexgarel commented Oct 31, 2024

The goal is to be able to patch taxonomy text files instead of re-generating them completely.

This will avoid having a lot of changes that are not related to the real modifications made by a contributor.

For this we need to:

  • add a "modified" property to entries (to track modified entries)
  • track the line locations of entries (to know where to change the original file)
  • keep removed entries (because we need to remove them from the original file)
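
As a rough illustration of how these three pieces could fit together, here is a minimal sketch. It is not the code from this PR: the `Entry` class, its field names (`start`, `end`, `modified`, `removed`) and the `patch_lines` helper are all hypothetical, and the sketch assumes each entry remembers the 0-based line range it occupied in the original file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical entry record; field names are illustrative only."""
    text: str                    # serialized lines for this entry
    start: Optional[int] = None  # first line in the original file (0-based), None if new
    end: Optional[int] = None    # one past the last original line
    modified: bool = False
    removed: bool = False


def patch_lines(original: list[str], entries: list[Entry]) -> list[str]:
    """Return a patched copy of `original`, touching only changed entries."""
    out = list(original)
    located = [
        e for e in entries
        if e.start is not None and (e.modified or e.removed)
    ]
    # Apply from the bottom up so that earlier line numbers stay valid.
    for e in sorted(located, key=lambda e: e.start, reverse=True):
        replacement = [] if e.removed else e.text.splitlines()
        out[e.start:e.end] = replacement
    # Brand-new entries have no location; append them at the end.
    for e in entries:
        if e.start is None and not e.removed:
            if out and out[-1] != "":
                out.append("")
            out.extend(e.text.splitlines())
    return out
```

Unmodified entries are never rewritten, so the resulting diff only contains the lines a contributor actually changed. Removed entries have to be kept around precisely because their recorded line range is what tells the patcher which lines to delete.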

Relates to: #541 and #366

TODO:

  • be able to patch taxonomy to generate the PR
  • add tests
  • add modified parameter in search
  • add sort on modified parameter in search (sort is not done at all, for now)
  • test it locally (see taxonomy: Update data_quality taxonomy openfoodfacts-bot/openfoodfacts-server#61)
  • avoid repeating comments when replacing an entry
  • put children after parents, not the other way around
  • on entry id change, also re-output children (especially the parent line)

* adding a modified property
* adding lines location of entries
* keeping removed entries

@alexgarel alexgarel marked this pull request as ready for review November 29, 2024 19:40
@alexgarel
Member Author

I still need to do some testing!

@alexgarel
Member Author

I ran a test (openfoodfacts-bot/openfoodfacts-server#61); it turns out there are still some bugs that I have to solve.

The most problematic case is when an entry id changes. I think that, along with the entry's last modification timestamp, I will also store the modified attributes, so that I know whether the entry id was modified.
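
A minimal sketch of that idea, assuming entries are plain dicts; the `modified_attributes` key and both helper names are hypothetical and not part of the current code:

```python
import time
from typing import Any, Optional


def record_changes(
    entry: dict[str, Any],
    updates: dict[str, Any],
    now: Optional[float] = None,
) -> dict[str, Any]:
    """Return an updated copy of `entry`, recording when and what changed."""
    changed = [key for key, value in updates.items() if entry.get(key) != value]
    result = {**entry, **updates}
    if changed:
        # Timestamp of the last modification (injectable for testing).
        result["modified"] = time.time() if now is None else now
        # Accumulate attribute names across successive edits.
        previous = set(entry.get("modified_attributes", []))
        result["modified_attributes"] = sorted(previous | set(changed))
    return result


def id_was_modified(entry: dict[str, Any]) -> bool:
    """True if the entry id was changed since modifications were tracked."""
    return "id" in entry.get("modified_attributes", [])
```

With this, the patching step could check `id_was_modified(entry)` to decide whether children also need to be re-output.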
